ASR on speech reconstructed from short-time Fourier phase spectra
Abstract
In our earlier papers [1, 2], we measured the human intelligibility of speech stimuli reconstructed from either the short-time magnitude spectra (magnitude-only stimuli) or the short-time phase spectra (phase-only stimuli) of a speech signal. We demonstrated that, even for small analysis window durations of 20-40 ms (the range relevant to automatic speech recognition), the short-time phase spectrum can contribute to speech intelligibility as much as the short-time magnitude spectrum. In this paper, we perform automatic speech recognition on magnitude-only and phase-only stimuli. With an MFCC-based front-end, recognition performance on the phase-only stimuli is much worse than on the magnitude-only stimuli at small analysis window durations, which is inconsistent with the corresponding human intelligibility results. This implies that the MFCC feature set does not capture all of the discriminating information present in the speech signal.
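As a rough illustration of what magnitude-only and phase-only stimuli are, the sketch below builds both from a short-time Fourier transform using NumPy. It is not the authors' procedure: the Hann window, the hop size, the single-pass overlap-add synthesis, the use of random phase for the magnitude-only case, and all function names (`stft`, `istft`, `phase_only`, `magnitude_only`) are assumptions made for this example.

```python
import numpy as np


def stft(x, frame_len, hop):
    """Short-time Fourier transform with a Hann analysis window (assumed)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack(
        [x[i * hop: i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1)


def istft(spec, frame_len, hop):
    """Weighted overlap-add synthesis, normalised by the summed squared window."""
    window = np.hanning(frame_len)
    n_frames = spec.shape[0]
    out = np.zeros((n_frames - 1) * hop + frame_len)
    norm = np.zeros_like(out)
    frames = np.fft.irfft(spec, n=frame_len, axis=1)
    for i in range(n_frames):
        start = i * hop
        out[start: start + frame_len] += frames[i] * window
        norm[start: start + frame_len] += window ** 2
    # Guard against division by ~0 at the very edges of the signal.
    return out / np.maximum(norm, 1e-8)


def phase_only(spec):
    """Keep the short-time phase, set every magnitude to one."""
    return np.exp(1j * np.angle(spec))


def magnitude_only(spec, rng):
    """Keep the short-time magnitude, replace the phase with random values
    (an assumption for this sketch; other choices are possible)."""
    random_phase = rng.uniform(-np.pi, np.pi, size=spec.shape)
    return np.abs(spec) * np.exp(1j * random_phase)


# Synthetic stand-in for a speech signal: 1 s of noise at 16 kHz (assumed rate).
rng = np.random.default_rng(0)
fs = 16000
x = rng.standard_normal(fs)

frame_len = 512          # 32 ms at 16 kHz, inside the 20-40 ms range discussed
hop = frame_len // 8     # heavy overlap, chosen arbitrarily for this sketch

spec = stft(x, frame_len, hop)
phase_stimulus = istft(phase_only(spec), frame_len, hop)
magnitude_stimulus = istft(magnitude_only(spec, rng), frame_len, hop)
```

The two resulting waveforms could then be passed through any feature extractor, such as an MFCC front-end, to compare how much of each stimulus type a recogniser can use.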
Similar resources
Iterative reconstruction of speech from short-time Fourier transform phase and magnitude spectra
In this paper, we consider the topic of iterative, one dimensional, signal reconstruction (specifically speech signals) from the magnitude spectrum and the phase spectrum. While this topic has been extensively researched and documented, we wish to recast some well-established results for the benefit of new researchers and those who desire a short, yet comprehensive, review of the subject. The t...
Some experiments on iterative reconstruction of speech from STFT phase and magnitude spectra
In our earlier work, we have measured human intelligibility of stimuli reconstructed either from the short-time magnitude spectra or short-time phase spectra of a speech signal. We demonstrated that, even for small analysis window durations of 20-40 ms (of relevance to automatic speech recognition), the short-time phase spectrum can contribute to speech intelligibility as much as the short-time...
On the Use of Phase Information in Speech Recognition
This study addresses the use of short-time phase spectra in automatic speech recognition (ASR). Two recent studies have proposed two group delay based spectral representations. Here we propose three new group delay based representations and compare usefulness of all these representations in an ASR experiment. We show that two of the representations we propose perform better, contain equivalent ...
Usefulness of Phase Spectrum in H
Short-time Fourier transform of speech signal has two components: magnitude spectrum and phase spectrum. In this paper, relative importance of short-time magnitude and phase spectra on speech perception is investigated. Human perception experiments are conducted to measure intelligibility of speech tokens synthesized either from magnitude spectrum or phase spectrum. It is traditionally believed...
Usefulness of phase in human speech perception
Short-time Fourier transform of speech signal has two components: magnitude spectrum and phase spectrum. In this paper, relative importance of short-time magnitude and phase spectra on speech perception is investigated. Human perception experiments are conducted to measure intelligibility of speech tokens synthesized either from magnitude spectrum or phase spectrum. It is traditionally believed...
Publication date: 2004